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Complexity certifications of first order inexact 
Lagrangian methods for general convex 
programming 


Ion Necoara, Andrei Patrascu and Angelia Nedic 


Abstract In this chapter we derive computational complexity certifications of first 
order inexact dual methods for solving general smooth constrained convex problems 
which can arise in real-time applications, such as model predictive control. When it 
is difficult to project on the primal constraint set described by a collection of gen¬ 
eral convex functions, we use the Lagrangian relaxation to handle the complicated 
constraints and then, we apply dual (fast) gradient algorithms based on inexact dual 
gradient information for solving the corresponding dual problem. The iteration com¬ 
plexity analysis is based on two types of approximate primal solutions: the primal 
last iterate and an average of primal iterates. We provide sublinear computational 
complexity estimates on the primal suboptimality and constraint (feasibility) viola¬ 
tion of the generated approximate primal solutions. In the final part of the chapter, 
we present an open-source quadratic optimization solver, referred to as DuQuad, for 
convex quadratic programs and for evaluation of its behavior. The solver contains 
the C-language implementations of the analyzed algorithms. 


1 Introduction 


Nowadays, many engineering applications can be posed as general smooth con¬ 
strained convex problems. Several important applications that can be modeled in 
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this framework have attracted great attention lately, such as model predictive control 
for dynamical linear systems and its dual (often referred to as moving horizon esti¬ 
mation) |1^[TT1[T7][T8]|2^ , DC optimal power flow problem for power systems ||22|, 
and network utility maximization problems ll^ . Notably, the recent advances in 
hardware and numerical optimization made it possible to solve linear model pre¬ 
dictive control problems of nontrivial sizes within microseconds even on hardware 
platforms with limited computational power and memory. 

In this chapter, we are particularly interested in real-time linear model predictive 
control (MPC) problems. For MFC, the corresponding optimal control problem can 
be recast as a smooth constrained convex optimization problem. There are numerous 
ways in which this problem can be solved. For example, an interior point method has 
been proposed in lfT9l and an active set method was described in Q. Also, explicit 
MPC has been proposed in ||2|, where the optimization problem is solved off-line for 
all possible states. In real-time (or on-line) applications, these methods can some¬ 
times fail due to their overly complex iterations in the case of interior point and 
active set methods, or due to the large dimensions of the problem in the case of ex¬ 
plicit MFCs. Additionally, when embedded systems are employed, computational 
complexities need to be kept to a minimum. As a result, second order algorithms 
(e.g. interior point), which most often require matrix inversions, are usually left out. 
In such applications, first order algorithms are more suitable iiiioiiniiniEa es¬ 
pecially for instances when computation power and memory is limited. For many 
optimization problems arising in engineering applications, such as real-time MFCs, 
the constraints are overly complex, making projections on these sets computation¬ 
ally prohibitive. This is most often the main impediment of applying first order 
methods on the primal optimization problem. To circumvent this, the dual approach 
is considered by forming the dual problem, whereby the complex constraints are 
moved into the objective function, thus rendering much simpler constraints for the 
dual variables, often being only the non-negative orthant. Therefore, we consider 
dual first order methods for solving the dual problem. The computational complex¬ 
ity certification of gradient-based methods for solving the (augmented) Lagrangian 
dual of a primal convex problem is studied e.g. in lfTl[3ll5ll7 UT0lfT^fT7ll . Flowever, 
these papers either threat quadratic problems ini or linearly constrained smooth 
convex problems with simple objective function fflCl, or the approximate primal 
solution is generated through averaging lISUTOlfThl . On the other hand, in practice 
usually the last primal iterate is employed. There are few attempts to derive the 
computational complexity of dual gradient based methods using as an approximate 
primal solution the last iterate of the algorithm for particular cases of convex prob¬ 
lems nnEi . Moreover, from our practical experience we have observed that usually 
these methods converge faster in the primal last iterate than in a primal average se¬ 
quence. These issues motivate our work here. 

Contribution. In this chapter, we analyze the computational complexity of dual first 
order methods for solving general smooth constrained convex problems. Contrary 
to most of the results from the literature Mll7l9lll6lfT7l . our approach allows us to use 
inexact dual gradient information. Another important feature of our approach is that 
we also provide complexity results for the primal latest iterate, while in much of the 
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previous literature convergence rates in an average of primal iterates are given. This 
feature is of practical importance since usually the primal last iterate is employed in 
applications. More precisely, the main contributions in this chapter are: 

(i) We derive the computational complexity of the dual gradient method in terms of 
primal suboptimality and feasibility violation using inexact dual gradients and two 
types of approximate primal solutions: O log 7 ) in the primal last iterate and 
O (i log i) in an average of primal iterates, where e is some desired accuracy. 

(a) We also derive the computational complexity of the dual fast gradient method in 
terms of primal suboptimality and feasibility violation using inexact dual gradients 
and two types of approximate primal solutions: O log 7 ) in the primal last iterate 

and log in a primal average sequence. 

(Hi) Finally, we present an open-source optimization solver, termed DuQuad, con¬ 
sisting of the C-language implementations of the above inexact dual first order algo¬ 
rithms for solving convex quadratic problems, and we study its numerical behavior. 
Content. The chapter is organized as follows. In Section 2 we formulate our problem 
of interest and its dual, and we analyze its smoothness property. In Section 3 we in¬ 
troduce a general inexact dual first order method, covering the inexact dual gradient 
and fast gradient algorithms, and we derive computational complexity certificates 
for these schemes. Finally, in Section 4 we describe briefly the DuQuad toolbox that 
implements the above inexact algorithms for solving convex quadratic programs in 
C-language, while in Section 5 we provide detailed numerical experiments. 
Notation. We consider the space K” composed of column vectors. For x, y G R", 
we denote the scalar product by (x, y) = x^y and the Euclidean norm by ||x|| = 
Vx'^x. We denote the nonnegative orthant by R" and we use [u]+ for the projection 
of u onto R". The minimal eigenvalue of a symmetric matrix Q G R"^" is denoted 
by Amin(Q) and ||Q||_f denotes its Frobenius norm. 


2 Problem formulation 

In this section, we consider the following general constrained convex optimization 
problem: 

/* = min /(u) s.t.: g(u) < 0 , ( 1 ) 

where U C R" is a closed simple convex set (e.g. a box set), 0 G R^ is a vector of 
zeros, and the constraint mapping g(-) is given by g(-) = [ 5 i(')! • ■ ■ (The 

vector inequality g(u) < 0 is to be understood coordinate-wise.) The objective 
function /(•) and the constraint functions ... ,gp{-) are convex and differen¬ 
tiable over their domains. Many engineering applications can be posed as the gen¬ 
eral convex problem O- For example for linear model predictive control problem 
in condensed form ll^ fm[T7l[T8ll20l : / is convex (quadratic) function, U is box set 
describing the input constraints and g is given by convex functions describing the 
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State constraints; for network utility maximization problem m: / is log function, 
U = R” and g is linear function describing the link capacities; for DC optimal 
power flow problem ll22l : / is convex function, U is box set and g describes the DC 
nodal power balance constraints. 

We are interested in deriving computational complexity estimates of dual first order 
methods for solving the optimization problem ([Til. We make the following assump¬ 
tions on the objective function and the feasible set of the problem ([T]). 

Assumption 1 Let U C dom f n {r\i-idom gi}, and assume that: 

(a) The Slater condition holds for the feasible set of problem o, i.e., there exists 
u G relint{U) such that g(u) < 0 . 

(b) The function f is strongly convex with constant <7/ > 0 and has Lipschitz con¬ 
tinuous gradients with constant Lf, i.e.: 

yllu-vf < /(u)-(/(v) + (V/(v),u-v)) < ^||u-vf Vu,vG U. 

(c) The function g has bounded Jacobians on the set U, i.e., there exists Cg > 0 such 
f/iflf II Vg(u)IIF < Cg foralluGlI. 

Moreover, we introduce the following definition: 

Definition 1. Given e > 0, a primal point G U is called e-optimal if it satisfies: 

|/(w)-/*|<e and ||[g(ue)]_^|| < e. 

Since U is assumed to be a simple set, i.e. the projection on this set is easy (e.g. a 
box set), we denote the associated dual problem of dD as; 

max d{x.) ( =min£(u,x) ) , ( 2 ) 

x>0 Y uGU J 

where the Lagrangian function is given by: 

'C(u,x) = /(u) -b (x,g(u)). 

We denote the dual optimal set with X* = argmaxd(x). Note that Assump- 

x>0 

tionlTJa) guarantees that strong duality holds for ([T]i. Moreover, since / is strongly 
convex function (see Assumption[T[6)), the inner subproblem min£(u, x) has the 

uGC/ 

objective function £(•, x) strongly convex for any fixed x G It follows that the 
optimal solution u* of the original problem ([T]) and u(x) = argminugc/ C(u, x) 
are unique and, thus, from Danskin’s theorem m we get that the dual function d is 
differentiable on R" and its gradient is given by: 

Vd(x) = g(u(x)) forallxGM”. 

From Assumption me) it follows immediately, using the mean value theorem, that 
the function g is Lipschitz continuous with constant Cg, i.e.. 
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||g(u)-g(v)|| < Cg||u- v|| Vu,vef/. (3) 

In the forthcoming lemma, Assumption[T](&) and (c) allow us to show that the dual 
function d has Lipschitz gradient. Our result is a generalization of a result in IH 
given there for the case of a linear mapping g(-) (see also cni for a different proof): 

Lemma 1. Under Assumption\J] the dual function d{-) corresponding to general 
convex problem O has Lipschitz continuous gradient with constant Ld = c^jaf, 
i.e., 

||Vd(x) - Vd(x)|| < Cg/cr/||x-x|| Vx,xgR^. (4) 

Proof. Let x, x G Then, by using the optimality conditions for u(x) and u(x), 
we get: 


V/(u(x)) + x*V 5 *(u(x)), u(x) - u(x) \ > 0, 

\ i=l / 

/ P \ 

V,/(u(x)) +^x,;V 5 *(u(x)),u(x) - u(x) \ > 0. 


Adding these two inequalities and using the strong convexity of /, we further obtain 
cr/||u(x) - u(x)f < (V/(u(x)) - V/(u(x)), u(x) - u(x)) 

< ( ^XiVg*(u(x)) - ^x,V5i(u(x)),u(x) - u(x) \ 


= ( - x,;)Vp,(u(x))-j]]x,(Vpi(u(x)) - Vp,(u(x))), u(x) - u(x) 


\i=l 


where the last inequality follows from the convexity of the function gi and x^ > 0 
for all i. By the Cauchy-Schwarz inequality, we have 


c^/l|u(x) - u(x)f < ^|xi-x,|||V 5 i(u(x))||||u(x)-u(x)|| 

< ||x-x||||Vg(u(x))||F||u(x) -u(x)|| 

< Cg||x-x||||u(x) -u(x)||, 


where the second inequality follows by Holder’s inequality and the last inequality 
follows by the bounded Jacobian assumption for g (see AssumptionlTfc)). Thus, we 
obtain: 


l|u(x) 


u(x)|| 



X-X . 


6 


Ion Necoara, Andrei Patrascu and Angelia Nedic 


Combining (O with the preceding relation, we obtain that the gradient of the dual 
function is Lipschitz continuous with constant La = —, i.e., 


||Vd(x) - Vd(x)|| = ||g(u(x)) - g(u(x))|| < Cg||u(x) - u(x)|| < -^||x - x||, 
for all X, X € □ 

Note that in the case of a linear mapping g, i.e., g(u) = Gu + g, we have 
Cg = ||G||ir > ||G||. In conclusion, our estimate on the Lipschitz constant of the 
gradient of the dual function for general convex constraints = c^/o'f can coin¬ 
cide with the one derived in na for the linear case = ||Gp/crf if one takes the 
linear structure of g into account in the proof of Lemma [T] (specifically, where we 
used Holder’s inequality). Based on relation (|4|l of Lemma[Tl the following descent 
lemma holds with = (^g/of (see for example IfT^ l: 

d(x)>d(y) + (Vd(y),x-y)-^||x-yf Vx,yeR^. (5) 

Using these preliminary results, in a unified manner, we analyze further the compu¬ 
tational complexity of inexact dual first order methods. 


3 Inexact dual first order methods 

In this section we introduce and analyze inexact first order dual algorithms for solv¬ 
ing the general smooth convex problem ([T]i. Since the computation of the zero-th 
and the first order information of the dual problem (|2]i requires the exact solution of 
the inner subproblem min£(u, x) for some fixed x € R(j), which generally cannot 

be computed in practice. In many practical cases, inexact dual information is avail¬ 
able by solving the inner subproblem with a certain inner accuracy. We denote with 
u(x) the primal point satisfying the 5-optimality relations: 

u(x) G C/, 0 < £(u(x),x) — d(x) <5 Vx G (6) 

In relation with (|6]l, we introduce the following approximations for the dual function 
and its gradient: 

d(x) = £(u(x), x) and Vc?(x) = g(u(x)). 

Then, the following bounds for the dual function c?(x) can be obtained, in terms of 
a linear and a quadratic model, which use only approximate information of the dual 
function and of its gradient (see mni Lemma 2.5]): 

0 < (d(y)-t-(Vd(y),x-y)^ - (i(x) < Ld||x-y|p-b 35 Vx,yGM^. (7) 
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Note that if (5 = 0, then we recover the exact descent lemma Q. Before we introduce 
our algorithmic scheme, let us observe that one can efficiently solve approximately 
the inner subproblem if the constraint functions pi(-) satisfy certain assumptions, 
such as either one of the following conditions; 

(1) The operator g(-) is simple, i.e., given v € [/ and x S RP, the solution of the 
following optimization subproblem: 



can be obtained in linear time, i.e. 0{n) operations. An example satisfying this 
assumption is the linear operator, i.e. g(u) = Gu+g,whereG G RP^'^,g G 
(2) Each function gi(-) has Lipschitz continuous gradients. 

In such cases, based on Assumption[TJb) (i.e. / has Lipschitz continuous gradient), it 
follows that we can solve approximately the inner subproblem min £(u, x), for any 


fixed X G K^J., with Nesterov’s optimal method for convex problems with smooth 
and strongly convex objective function CSl. Without loss of generality, we assume 
that the functions pi(-) are simple. When g(-) satisfies the above condition (2), there 
are minor modifications in the constants related to the convergence rate. Given x G 
the inner approximate optimal point u(x) satisfying £(u(x),x) — d(x) < 6 
is obtained with Nesterov’s optimal method IfTSlI after Ns projections on the simple 
set U and evaluations of V/, where 



( 8 ) 


with Rp{x) = ||v° — u(x)||, and v° being the initial point of Nesterov’s optimal 
method. When the simple feasible set U is compact with a diameter Rp (such as for 
example in MFC applications), we can bound i?p(x) uniformly, i.e.. 


-Rp(x) < Rp Vx G 


In the sequel, we always assume that such a bound exists, and we use warm- 
start when solving the inner subproblem. Now, we introduce a general algorithmic 
scheme, called Inexact Dual First Order Method (IDFOM), and analyze its conver¬ 
gence properties, computational complexity and numerical performance. 

Algorithm IDFOM 


Given y° G 5 > 0, for fc > 0 compute: 

1. Find \i^ gU such that £(u'"', y'"') — d(y^) < 5, 

2. Update x'"'= y^' +^Vd(y^') 
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where = u(y^'), Vd{y^) = g(u^) and the selection of the parameter 9k is 
discussed next. More precisely, we distinguish two particular well-known schemes 
of the above framework; 


• IDGM; by setting 9k = 0 for all fc > 0, we recover the Inexact Dual Gradient 
Method since = x^. For this scheme, we define the dual average sequence 

k 

x^' = ■ We redefine the dual final point (the dual last iterate x^ when 

3=0 


+ 2i:Vd(x'=) 


j + 


. Thus, all 


some stopping criterion is satisfied) as x^ = 

the results concerning x^ generated by the algorithm IDGM will refer to this 
definition. 

• IDFGM: by setting 9k = for all k > 0, we recover the Inexact Dual Fast 
Gradient Method. This variant has been analyzed in | 


Note that both dual sequences are dual feasible, i.e., x^, y^ G for all k > 0, and 
thus the inner subproblem min £(u, y^) has the objective function strongly convex. 

Towards estimating the computational complexity of IDFOM, we present an unified 
outer convergence rate for both schemes IDGM and IDFGM of algorithm IDFOM 
in terms of dual suboptimality. The result has been proved in l|3][Tol . 


Theorem I. Given <5 > 0, let {(x^, y^)}/c>o be the dual sequences gener¬ 

ated by algorithm IDFOM. Under Assumption\I\ the following relation holds: 


r 




Vfc > 1, 


where p{9) = I ’ ^ ^ ^ and Rd = ruin ||y° —x*||. 

12 , if0k = ^, 

Proof. Firstly, consider the case 9k = 0 (which implies p{9) = 1). Note that the 
approximate convexity and Lipschitz continuity relations (|7]i lead to: 

d(x'=) > J(x'=) + (Vd(x'=), x'^ - - Td||x^ - x'^f - 3b 

>J(x'=)+Td||x'=-x'=f-3b 

0 

> d(x'=) - 3b, (9) 

where in the second inequality we have used the optimality conditions of x^ = 
[x^ + ^ ^he other hand, using |[3] Theorem 2], the following 

convergence rate for the dual average point x^ can be derived: 

/*-d(x'=) < ^^ + b Vfc>l. (10) 

Combining (|9|l and (fTOl i we obtain the first case of the theorem. The second case, 
concerning 9k = has been shown in ll^ fTOl . □ 
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Our iteration complexity analysis for algorithm IDFOM is based on two types of 
approximate primal solutions: the primal last iterate sequence (v^')fc>o defined by 
= u(x^') or a primal average sequence (u^')fc>o of the form: 



k 



( 11 ) 


k 


(fe+i)W) E(j + l)u^ ifIDFGM. 

3=0 


Note that for algorithm IDGM we have = u^', while for algorithm IDFGM 
^ yfc Without loss of generality, for the simplicity of our results, we assume: 

y°=0, i?d > max , La > 1- (12) 

I Cg Cg j 

If any of these conditions do not hold, then all of the results from below are valid 
with minor variations in the constants. 


3.1 Computational complexity of IDFOM in primal last iterate 

In this section we derive the computational complexity for the two main algorithms 
within the framework of IDFOM, in terms of primal feasibility violation and primal 
suboptimality for the last primal iterate v^' = u(x^'). To obtain these results, only 
in this section, we additionally make the following assumption: 

Assumption 2 The primal set U is compact, i.e. max ||u — v|| = Rp < oo. 

u.vGt/ 

Assumption |2] implies that the objective function / is Lipschitz continuous with 
constant Lf, where Lf = max ||V/(u)||. Now, we are ready to prove the main 

\i£U 

result of this section, given in the following theorem. 

Theorem 2. Let e > Obe some desired accuracy and = u(x^) be the primal last 

iterate generated by algorithm IDFOM. Under Assumptions\I\and^ by setting: 



(13) 



the following assertions hold: 


.2\2/p(e) 


I 6L \ ' > 

[i) The primal iterate is e-optimal after a ( —^ J outer iterations. 

(a) Assuming that the primal iterate is obtained with Nesterov’s optimal method 


mi applied to the subproblem min £(u, x^), then is e-optimal after 









10 


Ion Necoara, Andrei Patrascu and Angelia Nedic 







log 


6 ^^ 


+ log 


[ L,Rl J 


total number of projections on the primal simple set U and evaluations ofV f. 

Proof, ii) From Assumption the Lagrangian £(u, x) is cr^-strongly convex 

in the variable u for any fixed x S which gives the following inequality m- 

£(u,x) > d(x) + ^||u(x) - u|p Vu e f/,X G . (14) 

Moreover, under the strong convexity assumption on / (cf. Assumption ([I])(&))j the 
primal problem ([T]i has a unique optimal solution, denoted by u*. Using the fact that 
(x,g(u*)) < 0 for any x > 0, we have; 

£(u*,x) - d(x) = /(u*) + (x,g(u*)) - d(x) </* - d(x) Vx G R^j.. (15) 

Combining (fTSl l and (fT4l) we obtain the following relation 

^||u(x) - u*||2 </* - d(x) VxGR?;., (16) 

which provides the distance from u(x) to the unique optimal solution u*. 

On the other hand, taking u = u(x) in (fT4l) and using (|6]l, we have: 

||g(u(x)) - g(u(x))|| < Cg||u(x) - u(x)|| ^/2LdS, (17) 

where we used that Lj = From (fThl l and (fTTl i. we derive a link between the 

primal infeasibility violation and dual suboptimality gap. Indeed, using the Lipschitz 
continuity property of g, we get: 


||g(u(x)) - g(u*)|| < ||g(u(x)) - g(u(x))|| + ||g(u(x)) - g(u*)|| 

< s/2L^ + y/2Ld{f* — c?(x)) Vx G M+. 

Combining the above inequality with the property g(u*) < 0, and the fact that for 
any a G R^ and b G R+ we have ||a + b|| > ||[a]+|l, we obtain: 


||[g(u(x))]^|| < a/ 2 Z^+A/ 24 ^( 7 F^^d(x)) VxGR^. (18) 

Secondly, we find a link between the primal and dual suboptimality. Indeed, using 
(x*, g(u*)) = 0, we have for all x G R+: 

f* = (x*,g(u*)) + /(u*) = min{/(u) + (x*,g(u))} < /(u(x)) + (x*, g(u(x))). 

uGL' 


Further, using the Cauchy-Schwarz inequality, we derive: 
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/(u(x)) - /* > -||x*||||g(u*) - g(u(x))|| 

> -Rd (\/iM + \/2Ld(/* - d(x))) Vx G RP . (19) 

On the other hand, from Assumption|2] we obtain; 


/(u(x)) - f* < i/||u(x) - U*|| < Lf (||u(x) - u(x)|| + ||u(x) 



Lf 





( 20 ) 


Taking x = x^' in relation ( fTSl l and combining with the dual convergence rate from 
Theorem[T] we obtain a convergence estimate on primal infeasibility: 


[g(v'=)]. 


< 


‘2,LiRd 

kpWU^ 




Letting x = x^ in relations (fl^ and (|20] | and combining with the dual convergence 
rate from Theorem[Tl we obtain convergence estimates on primal suboptimality: 


‘^LgRl 

}^p(0)/2 


‘IL jCgRd 


{2LdRlsf^<fiv'^)-r 

1/2 

+ 


+ Lf 


\ o-/ 



1/2 


Enforcing to be primal e-optimal solution in the two preceding primal conver¬ 
gence rate estimates, we obtain the stated result. 

{a) At each outer iteration fc > 0, by combining the bound (fOl l with the inner 
complexity (|8ll, Nesterov’s optimal method ifTSll for computing v*’ requires: 






P.aP(0)-^ 




projections on the set U and evaluations of V/. Multiplying with the outer com¬ 
plexity given in part (i), we obtain the result. □ 


Thus, we obtained computational complexity estimates for primal infeasibility and 
suboptimality for the last primal iterate v^' of order log 2) for the scheme 
IDGM and of order 0(2 log 2) for the scheme IDFGM. Furthermore, the inner 
subproblem needs to be solved with the inner accuracy 5 of order O(e^) for IDGM 
and of order O(e^) for IDFGM in order for the last primal iterate = u(x^) to 
be an e-optimal primal solution. 
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3.2 Computational complexity ofIDFOM in primal average iterate 

In this section, we analyze the computational complexity of algorithm IDFOM in 
the primal average sequence defined by (fTTT) . Similar derivations were given 
in Qo). For completeness, we also briefly review these results. Since the average 
sequence is different for the two particular algorithms IDGM and IDFGM, we pro¬ 
vide separate results. First, we analyze the particular scheme IDGM, i.e., in IDFOM 
we choose 0^ = 0 for all fc > 0. Then, we have the identity = x^' and do not 
assume anymore the redefinition of the last point = [x^ + 2 ^V(i(x^)]_|_, i.e., 
algorithm IDGM generates one sequence {x*} using the classical gradient update. 

Theorem 3. Let e > 0 and = u(x^) be the primal sequence generated by the 
algorithm IDGM (i.e. 0k = 0 for all k > Oj. Under Assumption\I] by setting: 

5 <- ( 21 ) 

_ 3 


the following assertions hold: 


(i) The primal average sequence given in (II lb is e-optimal after 




outer 


iterations. 

(a) Assuming that the primal iterate = u(x^) is obtained by applying Nesterov’s 
optimal method 4751? to the subproblem min 73(u, x^), the primal average iterate 

uG£/ 


is e-optimal after: 


8 


-) 


1/2 




log 


LfR\ 


total number of projections on the primal simple set U and evaluations ofV f. 
Proof, (i) Using the definition of x^+^, we have; 


— X"' = 


^ — x-^ + —r—'V d(pU) 


— Vj > 0. 


-I -t 


Summing up the inequalities for j = 0,... ,k and dividing by k, implies: 

k r 


2Li 


fc + 1 

2Ld 


■fc+l _ ^ 


(xfc+1 _ xO) = 


fc + 1 


\j=0 


1 


Vd(x^) 


— x-' 


J -I- 


fc + 1 


k r 

E 

7=0 


x-^ H-Vc?(x'^) 

2Ld 


J + 


2Td 


Vd(x^) 


fc + 1 




7=0 


Using the fact that Vd(x7) = g(u7), the convexity of g and denoting z-l = 


+ 2 i;Vd(x^ )] ^ - (x^- + 2 i;Vd(x^ )) e K^, we get; 
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j=o 

Note that if a vector pair (a, b) satisfies a < b, then [a]+ < [b]+ and ||[a] + || < 
||[b] + ||. Using these relations and the fact that > 0, we obtain the following 
convergence rate on the feasibility violation; 


[g(u'')]. 


< 


< 







/ 2Ld 

[x'=+i-x°]^ 


“ fc + 1 

-1 + 




2Ud||x^'+i-x°|| 
k + I 


( 22 ) 


On the other hand, from ifTOl Theorem 3.1], it can be derived that: 
llx-’+^-xlpS ||x^-x|p-^(Vd(x'’),x-X'’) + ^^d(x^+^)-d(x'’) + 3(5^ , (23) 


for all X > 0 and j > 0. Using 0, i.e. d(x) < d(x^) + (Vd(x^), x — x-^), taking 
X = X*, using d(x'^+^) < (i(x*) and summing over j from j = 0 to fc, we obtain: 


xfe+i _xU| < ||x°-x*| 


'35(fc + l) 


Td 


(24) 


Combining the estimate for feasibility violation and (l24l) . we finally have; 

2Ld(||x°-x*|| + ||x''+^-x*||) 4Ld||x°-x*|| / 3id^ 


[g(u")]. 


<■ 


■ < 


-+2 


fc + 1 


• (25) 


fc + 1 fc + 1 

In order to obtain a sublinear estimate on the primal suboptimality, we write: 

/* =min{/(u) + (x*,g(u))} < /(u'=) + (x*,g(u'“')) < /(u'^') + (x*, [g(u'“')]_^) 

, /D. , ii„0in (25) 


</(u'=) + ||x*|| [g(u'=)], </(u'=) + (i?d + ||x°||) [g(u'=)] 


On the other hand, taking x = 0 in ( l2Tt and using the definition of d{x !>), we obtain: 

llx-^+^lP < ||x-^|p + ^(Vd(x'^),x^) + ^(d(x^+^)-/(u'^) - (x^ Vd(x'^))+3b) 


< ||x^f + —(r-/(u^) + 3<5). 

Using an inductive argument, the convexity of / and the definition of u^, we get; 


-r< + 3^- 


(27) 
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Using the assumption x° = 0, from (|25]) . (|2^ and dZTl i. we get; 



From assumptions on the constants Rd, and <5 (see dUll and (EB), our first result 
follows. 

{a) Taking into account the relation (l2Tli on 6, the inner number of projections on 
the simple set U at each outer iteration is given by: 



Multiplying with the outer complexity obtained in (i), we get the second result. □ 

Further, we study the computational complexity of the second particular algorithm 
IDFGM, i.e. the scheme IDFOM with dk = Note that in the framework 
IDFOM both sequences {x^}fe>o and {y^}fc>o are dual feasible, i.e. are in ]R?|.. 
Based on lHH Theorem 2] (see also lIMT^ 'l. when 9^ = we have the following 
inequality which will help us to establish the convergence properties of the particular 
algorithm IDFGM: 


(k + l)(fc + 2) (fc + 1)(A: + 2)(fc + 3) 


'28) 



We now derive complexity estimates for primal infeasibility and suboptimality of 
the average primal sequence {u^}fe>o as defined in (fTTI) for algorithm IDFGM. 


Theorem 4. Let e > 0 and u^' = u(y^) be the primal sequence generated by 




(29) 


the following assertions hold: 


(i) The primal average iterate given in (II lb is e-optimal after 



outer iterations. 

{a) Assuming that the primal iterate = u(y^) is obtained by applying Nesterov’s 
optimal method 4751? to the subproblem min £(u, y^), the average primal iterate 


if is e-optimal after: 
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\ £3/2 j 


total number of projections on the primal simple set U and evaluations ofVf. 
Proof, ii) For primal feasibility estimate, we use (l28l l and the convexity of / and g: 


™>o (~ + l|x - y°f + < d(x'') -/(u'=) + (fc + 3),5. (30) 

For the right hand side term, using the strong duality and x* > 0, we have; 
d{^^) - < d{K*) - /(u'=) = min{/(u) + (x*,g(u))} - /(u'=) 

<(x*,g(u'=))<(x*,[g(u'^)] + ). (31) 


By evaluating the left hand side term in ( l30l ) at x = [g(u^)]+ and observing 

that ([g(u^')] + , g(u^) — [g(u^)]+) = 0 we obtain: 


max I — 

x>0 


4Ld 


(k + iy 


x-y^r + (x,g(u'')) 


(32) 


> ^^ 42 )!||[g(u‘)] +(/. [g(u*)] ), 


16Ld 


(fc+l )2 


Combining dlTT i and (1^ with d^ . using the Cauchy-Schwarz inequality and nota¬ 
tion 7 = ||[g(u^)] + || we obtain: 


(k + l)^ 2 /, , oxx II , Oil., 4Ld||y' 


, 0||2 


16Ld 


- 7 "-(fc + 3 ) 5 -||x*-y^|| 7 - 


(fe + 1) 


2 — 


< 0 . 


Thus, 7 must be less than the largest root of the second-order equation, from which, 
together with the definition of Rd we get; 


ll[g(fi'=)] + ll< 


16Ldi?d /3Ld<5 

(fc-f 1)2 A:-f 1 ■ 


(33) 


For the left hand side on primal suboptimality, using x* > 0, we have: 

f* = min{/(u) -f (x*,g(u))} < f{u'^) + (x*,g(u'')) 

u£U 

< /(u^) + (x*, [g(u^)]+) < /(u'=) + i?d||[g(u'“')] + ||. 

Using (I 33 I 1 . we derive an estimate on the left hand side primal suboptimality: 


-r< 


16Ldi?g 

(fc+l )2 


+ 4i?d 



(34) 
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On the other hand, taking x = 0 in (l30l) and recalling that y° = 0, we get; 



||x-yOf + (x,g(u'=))) +(A; + 3)<5 


(35) 


Moreover, taking into account that d{x.^) < /*, from (l54l i and dlST l we obtain: 



(36) 


From the convergence rates (l3^ and (l3^ we obtain our first result. 

{ii) Substitution of the bound (|29] | into the inner complexity estimate © leads to: 



projections on U and evaluations of V/ for each outer iteration. Multiplying with 
the outer complexity estimate obtained in part (*), we get our second result. □ 

Thus, we obtained computational complexity estimates for primal infeasibility and 
suboptimality for the average of primal iterates u^' of order O (\ log \) for the 
scheme IDGM and of order log i) for the scheme IDFGM. Moreover, the 
inner subproblem needs to be solved with the inner accuracy 5 of order 0(e) for 
IDGM and of order 0{ey/e) for IDFGM so that to have the primal average se¬ 
quence as an e-optimal primal solution. Further, the iteration complexity esti¬ 
mates in the last primal iterate are inferior to those estimates corresponding to an 
average of primal iterates u^. However, in practical applications we have observed 
that algorithm IDFOM converges faster in the last primal iterate than in the primal 
average sequence. Note that this does not mean that our analysis is weak, since we 
can also construct problems which show the behavior predicted by the theory. 


4 DuQuad toolbox 

In this section, we present the open-source solver DuQuad based on C-language 
implementations of the framework IDFOM for solving quadratic programs (QP) 
that appear in many applications. For example linear MPC problems are usually 
formulated as QPs that need to be solved at each time instant for a given state. Thus, 
in this toolbox we considered convex quadratic programs of the form: 


min f(u) 
uett ^ ^ 



s.t. : Gu + g < 0, (37) 
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where Q 0, G G and U C R" is a simple compact convex set, i.e. 

a box U = [lb ub]. Note that our formulation allows to incorporate in the QP 
either linear inequality constraints (arising e.g. in sparse formulation of predictive 
control and network utility maximization) or linear equality constraints (arising e.g. 
in condensed formulation of predictive control and DC optimal power flow). In fact 
the user can define linear constraints of the form: lb < Gu+g < ub and depending 
on the values for lb and ub we have linear inequalities or equalities. Note that 
the objective function of dJTl i has Lipschitz gradient with constant Lf = Amax(Q) 
and its dual has also Lipschitz gradient with constant . Based on the 

scheme IDFOM, the main iteration in DuQuad consists of two steps: 

Step 1: for a given inner accuracy 5 > 0 and a multiplier x G R[J., we solve ap¬ 
proximately the inner problem with accuracy 6 to obtain an approximate solution 
u(x) instead of the exact solution u(x), i.e.: £(u(x),x) — d{x.) < S. In DuQuad, 
we obtain an approximate solution u(x) using Nesterov’s optimal method ifTSll and 
warm-start. 

Step 2: Once a b-solution u(x) for inner subproblem was found, we update at the 
outer stage the Lagrange multipliers using the scheme IDFOM, i.e. for updating the 
Lagrange multipliers we use instead of the true value of the dual gradient V(i(x) = 
Gu(x) -f g, an approximate value Vd(x) = Gu(x) g. 


MatHab 



Fig. 1 DuQuad workflow. 


An overview of the workflow in DuQuad lb) is illustrated in Fig. [T] A QP problem 
is constructed using a Matlab script called test.m. Then, the function duquad.m is 
called with the problem data as input and it is regarded as a preprocessing stage for 
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the online optimization. The binary MEX file is called, with the original problem 
data and the extra information as an input. The main.c file of the C-code includes 
the MEX framework and is able to convert the MATLAB data into C format. Eur- 
thermore, the converted data gets bundled into a C “struct” and passed as an input 
to the algorithm that solves the problem using the two steps as described above. 

In DuQuad a user can choose either algorithm IDFGM or algorithm IDGM for 
solving the dual problem. Moreover, the user can also choose the inner accuracy 5 
for solving the inner problem. In the toolbox the default values for 5 are taken as 
in Theorems 12] |3] and |4] Erom these theorems we conclude that the inner QP has to 
be solved with higher accuracy in dual fast gradient algorithm IDFGM than in dual 
gradient algorithm IDGM. This shows that dual gradient algorithm IDGM is robust 
to inexact information, while dual fast gradient algorithm IDFGM is sensitive to 
inexact computations, as we can also see from plots in Fig.|2| 




Fig. 2 Sensitivity of IDGM (left) and IDFGM (right) in the average of iterates in terms of primal 
suhoptimality w.r.t. different values of the inner accuracy <5 for a QP (n = 50 and p = 75) with 
desired accuracy e = 0.01. 


Let us analyze now the computational cost per inner and outer iteration for algorithm 
IDFOM for solving approximately the original QP problem (iTTl i: 

Inner iteration: When solving the inner problem with Nesterov’s optimal method 
ini, the main computational effort is done in computing the gradient of the La- 
grangian V£(u, x) = Qu + q + G^x. In DuQuad these matrix-vector operations 
are implemented efficiently in C (the matrices that do not change along iterations 
are computed once and only G^x is computed at each outer iteration). The cost 
for computing V£(u, x) for general QPs is 0{n?). However, when the matrices Q 
and G are sparse (e.g. network utility maximization problem) the cost 0(71^) can 
be reduced substantially. The other operations in algorithm IDFOM are just vector 
operations and, hence, they are of order 0{n). Thus, the dominant operation at the 
inner stage is the matrix-vector product. 

Outer iteration: The main computational effort in the outer iteration of IDFOM is 
done in computing the inexact gradient of the dual function: Vc?(x) = Gu(x) 4 - g. 















Complexity certifications of first order inexact dual methods 


19 


The cost for computing Vd{x) for general QPs is 0{np). However, when the matrix 
G is sparse, this cost can be reduced. The other operations in algorithm IDFOM are 
of order 0{p). Hence, the dominant operation at the outer stage is also the matrix- 
vector product. 

Fig.[3]displays the result of profiling the code with gprof. In this simulation, a stan¬ 
dard QP with inequality constraints, and with dimensions n = 150 and p = 225 
was solved by algorithm IDFGM. The profiling summary is listed in the order of 
the time spent in each file. This figure shows that most of the execution time of the 
program is spent on the library module math-functions.c. More exactly, the domi¬ 
nating function is mtx-vec-mul, which multiplies a matrix with a vector. 


Name(locdtion) 

SampI 

Calls 

Time/Call 

%Time 

’ Summary 

154 



'ioo.o% [ 

► dfgm.c 

0 



0.0% 

► fgm.c 

1 



0.65% 

► general_functions.c 

0 



0.0% 

► init_problem.c 

0 



0.0% 

► main.c 

0 



0.0% 

^ math_functions.c 

153 



99.35% 

abs_2 

0 

12983 

Ons 

0.0% 

mtx_transpose 

0 

1 

0ns 

0.0% 

mtx vec mul 

142 

26590 

53.403US 

(92.21% 1 

obj 

0 

13234 

Ons 

0.0% 

► vector_add 

2 

26464 

755ns 

1.3% 

► vector_copy 

1 

27464 

364ns 

0.65% 

vector_max 

0 

12860 

Ons 

0.0% 

vector_max_with_zero 

0 

372 

Ons 

0.0% 

► vector_min 

1 

12860 

777ns 

0.65% 

► vector_mul 

2 

26718 

748ns 

1.3% 

vector_norm_2 

0 

124 

Ons 

0.0% 

► vector_scalar_mul 

3 

26215 

1.144US 

1.95% 

► vector_sub 

2 

26960 

741ns 

1.3% 


Fig. 3 Profiling the code with gprof. 


In conclusion, in DuQuad the main operations are the matrix-vector products. 
Therefore, DuQuad is adequate for solving QP problems on hardware with limited 
resources and capabilities, since it does not require any solver for linear systems or 
other complicating operations, while most of the existing solvers for QPs from the 
literature (such as those implementing active set or interior point methods) require 
the capability of solving linear systems. On the other hand, DuQuad can be also 
used for solving large-scale sparse QP problems since, in this case, the iterations are 
computationally inexpensive (only sparse matrix-vector products). 


5 Numerical simulations with DuQuad 

For numerical experiments, using the solver DuQuade we at first consider ran¬ 
dom QP problems and then a real-time MPC controller for a self balancing robot. 
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5.1 Random QPs 

In this section we analyze the behavior of the dual first order methods presented in 
this chapter and implemented in DuQuad for solving random QPs. 



Fig. 4 Number of outer iterations on random QPs for IDGM and IDFGM in primal last/average 
of iterates for different test cases of the same dimension (left) and of variable dimension (right). 


In Fig. |4]we plot the practical number of outer iterations on random QPs of algo¬ 
rithms IDGM and IDFGM for different test cases of the same dimension n = 50 
(left) and for different test cases of variable dimension ranging from n = 10 to 
n = 500 (right). We have choosen the accuracy e = 0.01 and the stopping criteria 
is the requirement that both quantities 

|/(u)-/*| and ||[Gu-)-g] + || 

are less than the accuracy e, where /* has been computed a priori with Matlab quad- 
prog. From this figure we observe that the number of iterations is not varying much 
for different test cases and, also, that the number of iterations is mildly dependent on 
the problem’s dimension. Finally, we observe that dual first order methods perform 
usually better in the primal last iterate than in the average of primal iterates. 


5.2 Real-time MFC for balancing robot 

In this section we use the dual first order methods presented in this chapter and 
implemented in DuQuad for solving a real-time MPC control problem. 

We consider a simplified model for the self-balancing Lego mindstorm NXT ex¬ 
tracted from II 2 TII . The model is linear time invariant and stabilizable. The contin¬ 
uous linear model has the states x € and inputs u € R. The states for this 
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angles inputs 




t t 

Fig. 5 The MFC trajectories of the state angle (left) and input (right) for Af = 10 obtained using 
algorithm IDGM from DuQuad in the last iterate with accuracy e = 10“^. 


system are the horizontal position and speed (h, h), and the angle to the vertical and 
the angular velocity of the robot’s body (9, 9). The input for the system represents 
the pulse-width modulaed voltage applied to both wheel motors in percentages. We 
discretize the dynamical system via the zero-order hold method for a sample time 
of T = Sms to obtain the system matrices: 



'1 0.0054 -2 • 10"^ lO-'^ ■ 


■ 0.0002 ■ 

A = 

0 0.4717 -0.0465 0.0211 

, B = 

0.0448 

0 0.03 1.0049 0.0068 

-0.0025 


0 6.0742 1.0721 0.7633 


-0.5147 


For this linear dynamical system we consider the duty-cycle percentage constraints 
for the inputs, i.e. —12 < u{t) < 12, and additional constraints for the position, i.e. 
—0.5 < h < 0.5, and for the body angle in degrees, i.e. —15 < 0 < 15. For the 
quadratic stage cost we consider matrices: Q = diag([l 1 6 • 10^ 1]) and R = 2. 
We consider two condensed MFC formulations: MPC smooth and MPCpenalized, 
where we impose additionally a penalty term /3{u{t) — u{t — 1))^, with /3 = 0.1, 
in order to get a smoother controller. Note that in both formulations we obtain QPs 
im. Initial state is x = [0 0 0.5 — 0.35]^ and we add gentle disturbances to the 
system at each 20 simulation steps. In Fig.|5]we plot the MPC trajectories of the state 
angle and input for a prediction horizon = 10 obtained using algorithm IDGM 
in the last iterate with accuracy e = 10“^. Similar state and input trajectories are 
obtained using the other versions of the scheme IDFOM from DuQuad. We observe 
a smoother behavior for MPC with the penalty term. 
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